SEMANTIC PROCESSING FOR SPEECH UNDERSTANDING


SUMMARY 

The semantic component of the speech understanding system
being developed jointly by SRI and SDC rules out phrase
combinations that are not meaningful and produces semantic
interpretations for combinations that are. The system consists
of a semantic network model and routines that interact with it.
The net is partitioned into a set of hierarchically ordered
subnets, facilitating the encoding of higher-order predicates and
the maintenance of multiple parsing hypotheses. Composition
routines, combining utterance components into phrases, consult
network descriptions of prototype situations and
surface-to-deep-case maps. Outputs from these routines are
network fragments consisting of several subnets that in aggregate
capture the interrelationships between a phrase's syntax and
semantics.


OVERVIEW

This paper describes aspects of the semantic component of
the speech understanding system currently being developed jointly
by SRI and SDC. (For a comprehensive discussion of nonacoustic
portions of this system, see Walker et al., 1975.) The semantic
component consists of two major parts: a semantic network coding
a model of the task domain and a battery of semantic composition
routines (SCRs) that are coordinated with the language definition
(roughly, the "grammar" for the speech understanding system; see
Paxton and Robinson, 1975, and Robinson, 1975). This paper
concentrates exclusively on the interplay between these two major
parts during parsing. However, the semantic component also plays
important roles in knowledge management, discourse analysis,
prediction, and question answering.

An SCR is called with network representations of components
that the associated language definition rule has found to be
syntactically capable of combining to form a larger phrase.
Using knowledge from the semantic net, the SCRs eliminate
combinations that, although syntactically acceptable, do not meet
semantic criteria for meaningful unification. For combinations
that are acceptable, the SCRs build network structures to
represent the meaning of the composite phrase, using the network
structures of the components as building blocks. These net
structures are constructed so that (1) multiple hypotheses
concerning the proper incorporation of a given utterance
constituent in larger phrases may be encoded simultaneously in
one net, (2) competing users of a constituent may share a single
network structure representing the constituent, and (3) the
association between each syntactic unit of an input and its
translation image in the network is explicitly encoded for use in
discourse analysis.


THE  SEMANTIC  NETWORK 

The semantic network is the principal information source for
SCRs, encoding such diverse entities as objects, situations,
categories, taxonomies, definitions, and quantified statements.
Network structures indicating possible relationships between
objects are used to determine the meaningfulness of phrase
combinations, while the network itself serves as the medium for
recording interpretations of utterance fragments during parsing.
The structure of this network differs from that of conventional
nets in that nodes and arcs are partitioned into "spaces". These
spaces, playing in networks a role roughly analogous to that
played in strings by parentheses, group information into bundles
that help to condense and organize the network's knowledge. An
introduction to net partitioning is provided elsewhere (Hendrix,
1975).

An illustrative portion of the permanent knowledge section
of the semantic network is depicted in Figure 1. In the upper
left corner is node 'U', representing the universal set U. To
the right is node 'PHYSOBJS', representing the set PHYSOBJS of
physical objects. That PHYSOBJS is a subset of U is indicated by
the s-arc from 'PHYSOBJS' to 'U'. A subset of PHYSOBJS is SUBS,
the set of all submarines. A particular element of SUBS, as
indicated by the e-arc from 'DOLPHIN' to 'SUBS', is the DOLPHIN.

FIGURE 1 A SAMPLING FROM THE GENERAL KNOWLEDGE NET

The DOLPHIN is a participant in a particular situation, HB,
the situation in which the DOLPHIN has a beam of 19 feet. HB is
an element of <HAVE.BEAM>, the set of all situations in which a
physical object is characterized by a measure of its breadth.
Certain outgoing arcs from a node representing a situation are
used to specify situation attributes through deep semantic cases.
For example, the outgoing obj-arc from 'HB' specifies the value
of the "obj" (object) attribute of HB to be DOLPHIN. Hereafter
the notation "#@obj" will be used to indicate "the value (#) of
the attribute (@) obj."
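
The arcs described so far can be illustrated with a minimal
sketch. The dictionary layout and the function names below are
assumptions made for illustration only and are not the system's
actual representation; each set is also assumed to have at most
one direct superset.

```python
# Hypothetical encoding of the Figure 1 fragment described above.
# e-arcs link an element to its set; s-arcs link a set to a superset.
E_ARCS = {"DOLPHIN": "SUBS", "HB": "<HAVE.BEAM>"}
S_ARCS = {"SUBS": "PHYSOBJS", "PHYSOBJS": "U"}

# Deep-case arcs leaving situation nodes. Only the obj-arc of HB is
# spelled out in the text, so only that arc is recorded here.
CASE_ARCS = {("HB", "obj"): "DOLPHIN"}


def supersets(set_name):
    """Yield set_name and every set reachable from it along s-arcs."""
    while set_name is not None:
        yield set_name
        set_name = S_ARCS.get(set_name)


def is_element_of(node, set_name):
    """True if node's e-arc leads to set_name or to one of its subsets."""
    home = E_ARCS.get(node)
    return home is not None and set_name in supersets(home)


def case_value(situation, case):
    """Return the value written #@case in the notation introduced above."""
    return CASE_ARCS.get((situation, case))
```

Under these assumptions, `is_element_of("DOLPHIN", "U")` holds
because the e-arc to SUBS is followed by two s-arcs, and
`case_value("HB", "obj")` returns the node for the DOLPHIN.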

The network of Figure 1 has been divided into five spaces:
KS, S4, S5, S6, and S7. Pictorially, each of these spaces is
represented by a box. The most global information in the network
is encoded in space KS (the outermost box, sometimes called the
"Knowledge Space"), which includes such entities as nodes 'U' and
'PHYSOBJS' and the s-arc connecting them. The boxes representing
spaces S4 through S7 may be thought of as holes in the box of KS.
Paralleling the relationship between an inner and an outer block
of an ALGOL program, each of these spaces specifies a more local
area of the net than is specified by KS. From the perspective of
S5, for example, it is possible to access both local node 'P' and
(relatively) global node 'PHYSOBJS'. However, from KS the nodes
and arcs inside S5 are not accessible. The hierarchy of space
localization may be represented by a partial ordering such as
that of Figure 2. From any space S, the accessible nodes and arcs
are those that lie in S or in any space S' above S in the
hierarchy. For example, from S3 only nodes and arcs in S3, S2,
S1, and KS are accessible.
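
The accessibility rule can be sketched as a walk up the partial
ordering. The parent links below are an assumption: the text
states only which spaces are visible from S3, so a simple chain is
used here for illustration, and the names `PARENTS` and
`visible_spaces` are hypothetical.

```python
# Assumed parent links (each space lists the spaces directly above it).
# A space may have several parents, since the ordering is only partial.
PARENTS = {"KS": [], "S1": ["KS"], "S2": ["S1"], "S3": ["S2"]}


def visible_spaces(space):
    """Return the space itself plus every space above it in the ordering."""
    seen = set()
    stack = [space]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen
```

As with ALGOL block scoping, the walk runs only upward: S3 sees
KS, but KS sees nothing inside S3.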

Pictorially, it may be necessary to draw an arc crossing box
boundaries. In such cases, the arc belongs to the space (or
spaces) in whose box the arc label is written. Spaces may
overlap. For example, in Figure 1, node 'ED,HB' lies in both
space S4 and space S5. Further, a space may serve as a node in a
more global space. Both S4 and S5 behave as nodes in KS and are
connected by a conse-arc (consequence).


FIGURE  2  SPACE  LOCALIZATION  HIERARCHY 

Typically, localized spaces such as S4 and S5 are used to
encode higher-order "predicates," such as quantifiers, logical
connectives, and hypothetical data. Here, S4 and S5 are used to
encode an implication. The space S4, doubling as a node in space
KS, is connected by an e-arc to '<IMPLY>' and by a conse-arc to
'S5'. The interpretation of any element of set <IMPLY> is that
if entities can be found matching the structure of the element
space, then the existence of entities matching the structure of
the associated conse space may be inferred. The only structure
encoded in element space S4 is a node with an e-arc to
'<HAVE.BEAM>'. This structure matches any concrete instance of
<HAVE.BEAM> (such as HB). Thus, for any instance of <HAVE.BEAM>,
entities matching the structure of S5 must exist. The structure
of S5 indicates that the element of <HAVE.BEAM> will have a
#@obj, which is an element of PHYSOBJS, and a #@measure, which is
an element of LINEAR.MEASURES.

The implication encoded by S4 and S5 serves to delineate the
set <HAVE.BEAM>. That is, the implication indicates all the
attributes (deep cases) of a <HAVE.BEAM> situation and their
ranges of acceptable values. This delineation may be used during
parsing to test the plausibility of a given group of entities
being united in a <HAVE.BEAM> situation or, in a predictive mode,
to suggest possible sentence participants. Such delineations are
encoded for every situation and event set known to the system, a
second example in Figure 1 being the delineation of set <BUILD>.
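
The two uses of a delineation, plausibility testing and
prediction, can be sketched as follows. This is an illustration
under stated assumptions rather than the system's own mechanism:
the delineation is flattened into a table of case ranges instead
of being matched as network structure, the node name "19.FEET" is
hypothetical, and LINEAR.MEASURES is assumed to lie directly below
U.

```python
# s-arcs (subset-of) and e-arcs (element-of) for a small fragment.
S_ARCS = {"SUBS": "PHYSOBJS", "PHYSOBJS": "U", "LINEAR.MEASURES": "U"}
E_ARCS = {"DOLPHIN": "SUBS", "19.FEET": "LINEAR.MEASURES"}

# Delineation of <HAVE.BEAM> as read off spaces S4 and S5: each deep
# case is paired with the set its value must belong to.
DELINEATIONS = {
    "<HAVE.BEAM>": {"obj": "PHYSOBJS", "measure": "LINEAR.MEASURES"},
}


def belongs_to(node, target):
    """Follow the node's e-arc, then s-arcs upward, looking for target."""
    current = E_ARCS.get(node)
    while current is not None:
        if current == target:
            return True
        current = S_ARCS.get(current)
    return False


def plausible(situation_set, bindings):
    """Test mode: every supplied case must exist and its value be in range."""
    ranges = DELINEATIONS[situation_set]
    return all(
        case in ranges and belongs_to(value, ranges[case])
        for case, value in bindings.items()
    )


def predict(situation_set, bindings):
    """Predictive mode: unfilled cases and the sets their values must come from."""
    ranges = DELINEATIONS[situation_set]
    return {case: rng for case, rng in ranges.items() if case not in bindings}
```

For instance, binding only the obj case to DOLPHIN is plausible,
and `predict` then reports that a measure drawn from
LINEAR.MEASURES is still expected.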

THE SYSTEM IN ACTION

The use of the SCRs and semantic network in translation may
be seen by considering the parsing of

"The power plant of the sub was built by Westinghouse."

The ultimate result of the translation process for this utterance
is the network structure recorded in the SCRATCH space of Figure
3. Structures representing new inputs are constructed in a
scratch space (or spaces) to prevent them from becoming confused
with the system's permanent knowledge (recorded in KS). Since
the system understands new inputs by appealing to previous
knowledge, there are many links, in the form of e-arcs, from the
SCRATCH space into KS. (Note: Only a fragment of KS is shown in
the various figures of this paper.)


FIGURE 3 PARSE TARGET STRUCTURE FOR "THE-POWER-PLANT OF THE-SUB WAS-BUILT BY WESTINGHOUSE"

The interpretation of the network in the SCRATCH space is as
follows: Node 'B' represents an element of the set <BUILD> of
building events in which a #@agt W built a #@obj P. The agent W
of the building event is an element of the set of WESTINGHOUSES.
The #@obj built by W is P, an element of the set POWER.PLANTS.
According to node 'H', this power plant is the #@subpart in a
<HAVE.PART> relationship in which S, the particular member of
SUBS currently in context, is the #@suppart (superpart).
Discourse analysis mechanisms discussed in Deutsch (1975) and,
more fully, in Walker et al. (1975) will be used to associate W
with the unique Westinghouse Corporation known to the semantic
net in space KS. The other definite NPs ("the sub" and "the
power plant of the sub") will likewise be resolved.


To  suppress  secondary  details  while  considering  the  building 
of  this  structure,  assume  the  highly  simplified  language 
definition: 


Grammar

R1:  S => NP VP

R2:  NP => NP PREPP

R3:  VP => VP PREPP

R4:  PREPP => PREP NP


Lexicon

NP:  the-power-plant, the-sub, Westinghouse

VP:  was-built

PREP:  of, by


(Note: "the-power-plant" is not treated as an NP in the actual
system. Rather, NOM "power plant" is first combined with PREPP
"of the sub" and only afterward is "the" appended to produce the
NP "the power plant of the sub".)

In the translation process, spaces are created to represent
the semantics of each grammatically defined constituent of the
total utterance. These spaces are shown in Figure 4 with heavy
arrows indicating the space hierarchy.


FIGURE 4 MULTIPLE SCRATCH SPACES FOR "THE-POWER-PLANT OF THE-SUB WAS-BUILT BY WESTINGHOUSE"

At the start of processing, space KS contains knowledge
about power plants, <HAVE.PART> relationships, submarines,
<BUILD> events, and Westinghouse. On spotting the noun phrase
"the-power-plant", an SCR is called to set up a space, NP1, below
KS in the partial ordering. Within this space, a structure is
created representing the meaning of "the-power-plant".
Similarly, new spaces are set up to encode the other sentence
constituents that correspond to explicit lexical entries.

As the parser groups subphrases into larger units, SCRs are
called to aid in the process. Using rule R4, PREP1 ("by") and
NP3 ("Westinghouse") are combined to form PREPP1 ("by
Westinghouse"). PREPP1 is allocated its own space, but no new
structures are created within it.

When syntactic considerations suggest combining VP1
("was-built") with PREPP1, the appropriate SCR is called.
Consulting a surface-to-deep-case map associated with the lexical
entry for the verb "build", the SCR determines that a "by" PREPP
following the verb often signals the deep agt case in a passive
construction. Operating under this hypothesis, the SCR checks
the voice of VP1. Passing this test, the SCR next checks the
semantic feasibility of the NP of PREPP1 serving as the #@agt in
a <BUILD> event. To do this, the SCR consults the #@delineation
of <BUILD> in space KS (see Figure 1). The delineation is
encoded as an <IMPLY> situation in terms of spaces S6 and S7. As
discussed earlier, this delineation indicates that any #@agt of a
<BUILD> situation must be an element of LEGAL.PERSONS. The
candidate for the #@agt position is W of space NP3. Since W is
an element of WESTINGHOUSES and WESTINGHOUSES is a subset of
LEGAL.PERSONS, W is accepted. A construction such as "built by
the submarine" would have been rejected.
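
The sequence of checks just described (case-map lookup, voice
test, then a range test against the delineation) can be sketched
as below. The table formats and function names are assumptions
for illustration; in particular, the voice test is folded into the
map key here, whereas the text describes it as a separate step,
and only the agt range of <BUILD> is taken from the paper.

```python
# Taxonomy fragment: e-arcs (element-of) and s-arcs (subset-of).
E_ARCS = {"W": "WESTINGHOUSES", "S": "SUBS"}
S_ARCS = {"WESTINGHOUSES": "LEGAL.PERSONS", "SUBS": "PHYSOBJS"}

# Hypothetical surface-to-deep-case map: (preposition, voice) -> deep case.
CASE_MAPS = {"build": {("by", "passive"): "agt"}}

# Situation set named by each verb, and the delineated range of each case.
SITUATION_SET = {"build": "<BUILD>"}
RANGES = {"<BUILD>": {"agt": "LEGAL.PERSONS"}}


def in_set(node, target):
    """Follow the node's e-arc, then s-arcs upward, looking for target."""
    current = E_ARCS.get(node)
    while current is not None:
        if current == target:
            return True
        current = S_ARCS.get(current)
    return False


def deep_case(verb, voice, prep, np_node):
    """Return the deep case the PREPP's NP may fill, or None if rejected."""
    case = CASE_MAPS.get(verb, {}).get((prep, voice))
    if case is None:
        return None  # no hypothesis for this preposition, or voice test failed
    allowed = RANGES[SITUATION_SET[verb]].get(case)
    if allowed is None or not in_set(np_node, allowed):
        return None  # candidate lies outside the delineated range
    return case
```

With these tables, W (an element of WESTINGHOUSES) is accepted as
the agt of a passive "build", while S (an element of SUBS) is
rejected, mirroring "built by the submarine".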

Once VP1 and PREPP1 have passed the acceptability tests, a
new space, VP2, is constructed to encode the resultant VP. This
new space links node 'B' of VP1 with node 'W' of NP3 via an
agt-arc. This new arc is accessible only from space VP2 (and
lower spaces in the hierarchy) and is not accessible from either
VP1 or NP3. This leaves the components encoded in VP1 and NP3
free to combine in alternatives to VP2 if need be.
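
How a new arc can be added without disturbing its constituents
may be sketched by storing each arc in the space that owns it.
The parent links follow the composition steps given above;
placing the two e-arcs in VP1 and NP3 is an assumption, as are the
names `ARCS` and `view`.

```python
# Spaces directly above each space in the hierarchy.
PARENTS = {
    "KS": [],
    "VP1": ["KS"], "PREP1": ["KS"], "NP3": ["KS"],
    "PREPP1": ["PREP1", "NP3"],
    "VP2": ["VP1", "PREPP1"],
}

# Arcs recorded in their owning space: (from_node, label, to_node).
ARCS = {
    "VP1": [("B", "e", "<BUILD>")],
    "NP3": [("W", "e", "WESTINGHOUSES")],
    "VP2": [("B", "agt", "W")],
}


def view(space):
    """Collect every arc lying in the space or in any space above it."""
    seen, arcs = set(), set()
    stack = [space]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        arcs.update(ARCS.get(current, []))
        stack.extend(PARENTS[current])
    return arcs
```

The agt-arc appears in `view("VP2")` but not in `view("VP1")` or
`view("NP3")`, so both constituents remain unchanged and available
to competing hypotheses.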

Continuing the parse, NP2 ("the-sub") is combined with VP2
("was-built by Westinghouse") to form S1, after passing tests
similar to those above. The obj-arc linking the constituent
phrases of S1 is contained in space S1 and hence is inaccessible
from the spaces of the constituents. Notice that the construct
"the-sub was-built by Westinghouse", which is encoded by S1, is a
spurious interpretation of utterance components.

Using rule R4, PREP "of" may be combined with NP2 to form
PREPP2. The network structures accessible from PREPP2 do not
include the (spurious) obj-arc from 'B' to 'S' that lies in space
S1.


When the syntax of rule R2 suggests combining NP1 and PREPP2
to form a new NP ("the-power-plant of the-sub"), an SCR is
called. This SCR checks NP1 to see if it is relational in nature
(as is "beam" in "beam of the Dolphin") and hence expecting an
argument to be supplied. Since NP1 fails this test, the SCR
checks the properties of the PREP "of" and discovers that it may
be used to encode <HAVE.PART> situations. Calling upon the
delineation of <HAVE.PART> and appropriate surface-to-deep-case
maps, the SCR determines this to be a legitimate interpretation
and hence builds space NP4 with a node and three arcs as
shown. While these new constructs are accessible from space NP4,
they are inaccessible from constituents NP1 and PREPP2 (and NP2).
Furthermore, they cannot be accessed from spurious space S1;
hence the construction of NP4 has not altered the view of the net
from S1.

Using rule R1, S2 is constructed from NP4 and VP2. In
addition to the obj-arc contained in space S2 itself, the view of
the net from S2 includes all the information accessible from
either space NP4 or space VP2 and hence is identical to the view
from space SCRATCH of Figure 3. Since the parse corresponding to
space S1 does not successfully account for the fragment
"the-power-plant of", it is rejected, and S2 is accepted as
expressing the meaning of the input.
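
The acceptance test, that a sentence-level space must account for
the whole input, can be sketched from the constituent structure
built during the walkthrough. The tables and function names are
hypothetical; the actual system's criteria for rejecting a parse
may be richer than this coverage check.

```python
# Constituent spaces of each composite space, following the walkthrough.
# S1 and S2 compete, and both reuse the single space VP2.
CONSTITUENTS = {
    "PREPP1": ["PREP1", "NP3"],
    "VP2": ["VP1", "PREPP1"],
    "S1": ["NP2", "VP2"],
    "PREPP2": ["PREP2", "NP2"],
    "NP4": ["NP1", "PREPP2"],
    "S2": ["NP4", "VP2"],
}

# Lexical spaces and the input words they encode.
LEXICAL = {
    "NP1": "the-power-plant", "PREP2": "of", "NP2": "the-sub",
    "VP1": "was-built", "PREP1": "by", "NP3": "Westinghouse",
}

UTTERANCE = ["the-power-plant", "of", "the-sub",
             "was-built", "by", "Westinghouse"]


def words_covered(space):
    """Return the input words spanned by a space, left to right."""
    if space in LEXICAL:
        return [LEXICAL[space]]
    words = []
    for part in CONSTITUENTS[space]:
        words.extend(words_covered(part))
    return words


def accepted(candidates):
    """Keep only the sentence-level spaces that span the entire utterance."""
    return [s for s in candidates if words_covered(s) == UTTERANCE]
```

Here `words_covered("S1")` omits "the-power-plant" and "of", so
`accepted(["S1", "S2"])` retains only S2.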

The partial ordering of spaces from S2 to KS indicated in
Figure 4 is identical to that represented more clearly in Figure
5, which, because of the choice of space labels, may be
recognized as the parse tree of the input sentence. The syntax
of the input and the association between each syntactic unit and
its corresponding semantics have therefore been captured in the
structures built by the SCRs.


FIGURE 5 SPACE HIERARCHY ABOVE S2
(leaf spaces, left to right: NP1, PREP2, NP2, VP1, PREP1, NP3)

DISCUSSION 

Partitioning is a recent innovation in semantic networks.
As shown above, this new feature enables networks to maintain
alternative hypotheses (e.g., S1 and S2) concerning the use of
utterance constituents and enables such competing hypotheses to
share network subparts (e.g., VP2). Without partitioning, the
back-linked nature of networks causes a constituent to be altered
when it is incorporated into a larger unit and hence renders it
unusable in alternative constructions. The highly ambiguous
nature of acoustic input makes these abilities to maintain
alternative hypotheses and share substructures especially
important in speech understanding.

Partitioning also allows selected portions of a network to
be associated with syntactic units, showing the correspondence
between network entities and the syntactic structures that were
used to communicate them. As discussed in the section on
"Discourse Analysis and Pragmatics" in Walker et al. (1975), this
association is crucial in analyzing the elliptic utterances that
are so characteristic of speech.